What the Vec? Towards Probabilistically Grounded Embeddings
Word2Vec (W2V) and GloVe are popular, fast and efficient word embedding
algorithms. Their embeddings are widely used and perform well on a variety of
natural language processing tasks. Moreover, W2V has recently been adopted in
the field of graph embedding, where it underpins several leading algorithms.
However, despite their ubiquity and relatively simple model architecture, a
theoretical understanding of what the embedding parameters of W2V and GloVe
learn and why that is useful in downstream tasks has been lacking. We show that
different interactions between PMI vectors reflect semantic word relationships,
such as similarity and paraphrasing, that are encoded in low dimensional word
embeddings under a suitable projection, theoretically explaining why embeddings
of W2V and GloVe work. As a consequence, we also reveal an interesting
mathematical interconnection between the considered semantic relationships
themselves.
Comment: Advances in Neural Information Processing, 201
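As a rough illustration of the idea, PMI vectors can be built from word co-occurrence counts and projected to a low dimension; the toy counts, the dimension `d`, and the truncated-SVD projection below are illustrative placeholders, not the paper's exact construction:

```python
import numpy as np

# Toy word co-occurrence counts (illustrative data); rows/cols index words.
counts = np.array([[10., 5., 1., 0.],
                   [ 5., 10., 2., 1.],
                   [ 1., 2., 8., 6.],
                   [ 0., 1., 6., 9.]])

# PMI_ij = log( p(i, j) / (p(i) p(j)) ), clipped to avoid log(0).
total = counts.sum()
p_ij = counts / total
p_i = counts.sum(axis=1) / total
p_j = counts.sum(axis=0) / total
pmi = np.log(np.maximum(p_ij / np.outer(p_i, p_j), 1e-12))

# Low-dimensional word embeddings via a rank-d projection (truncated SVD);
# each row of `emb` is one word's embedding.
d = 2
U, S, Vt = np.linalg.svd(pmi)
emb = U[:, :d] * S[:d]
```

Interactions between the rows of `pmi` (e.g. dot products) are then approximately preserved by the corresponding rows of `emb`, which is the sense in which semantic relationships survive the projection.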
TuckER: Tensor Factorization for Knowledge Graph Completion
Knowledge graphs are structured representations of real world facts. However,
they typically contain only a small subset of all possible facts. Link
prediction is a task of inferring missing facts based on existing ones. We
propose TuckER, a relatively straightforward but powerful linear model based on
Tucker decomposition of the binary tensor representation of knowledge graph
triples. TuckER outperforms previous state-of-the-art models across standard
link prediction datasets, acting as a strong baseline for more elaborate
models. We show that TuckER is a fully expressive model, derive sufficient
bounds on its embedding dimensionalities and demonstrate that several
previously introduced linear models can be viewed as special cases of TuckER.
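The Tucker-decomposition score of a triple (subject, relation, object) can be sketched as a three-way contraction of a shared core tensor with the subject, relation, and object embeddings; the dimensions and random values below are toy placeholders:

```python
import numpy as np

# score(s, r, o) = W x1 e_s x2 w_r x3 e_o: contract the core tensor W
# with the subject, relation, and object embeddings along its three modes.
de, dr = 4, 3                          # entity / relation embedding dims (toy sizes)
rng = np.random.default_rng(0)
W = rng.normal(size=(de, dr, de))      # shared core tensor
e_s = rng.normal(size=de)              # subject entity embedding
w_r = rng.normal(size=dr)              # relation embedding
e_o = rng.normal(size=de)              # object entity embedding

# Single einsum performs all three mode contractions at once.
score = np.einsum('ijk,i,j,k->', W, e_s, w_r, e_o)
```

In practice this raw score is passed through a sigmoid and trained against observed triples; only the contraction itself is shown here.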
Interpreting Knowledge Graph Relation Representation From Word Embeddings
Many models learn representations of knowledge graph data by exploiting its
low-rank latent structure, encoding known relations between entities and
enabling unknown facts to be inferred. To predict whether a relation holds
between entities, embeddings are typically compared in the latent space
following a relation-specific mapping. Whilst their predictive performance has
steadily improved, how such models capture the underlying latent structure of
semantic information remains unexplained. Building on recent theoretical
understanding of word embeddings, we categorise knowledge graph relations into
three types and for each derive explicit requirements of their representations.
We show that empirical properties of relation representations and the relative
performance of leading knowledge graph representation methods are justified by
our analysis.
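A minimal sketch of the generic setup described above: a relation-specific mapping is applied to the subject embedding, and the result is compared with the object embedding in the latent space. The linear mapping and dot-product comparison here are one common choice, used only for illustration:

```python
import numpy as np

# Generic link-prediction score: map the subject embedding with a
# relation-specific matrix M_r, then compare with the object embedding
# by dot product (sizes and values are illustrative).
d = 4
rng = np.random.default_rng(1)
e_s = rng.normal(size=d)          # subject entity embedding
e_o = rng.normal(size=d)          # object entity embedding
M_r = rng.normal(size=(d, d))     # relation-specific mapping

score = (M_r @ e_s) @ e_o         # higher score => relation more likely to hold
```

Different relation types then correspond to different structural requirements on `M_r`, which is what the categorisation in the abstract analyses.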
Hypernetwork Knowledge Graph Embeddings
Knowledge graphs are graphical representations of large databases of facts,
which typically suffer from incompleteness. Inferring missing relations (links)
between entities (nodes) is the task of link prediction. A recent
state-of-the-art approach to link prediction, ConvE, implements a convolutional
neural network to extract features from concatenated subject and relation
vectors. Whilst results are impressive, the method is unintuitive and poorly
understood. We propose a hypernetwork architecture that generates simplified
relation-specific convolutional filters that (i) outperforms ConvE and all
previous approaches across standard datasets; and (ii) can be framed as tensor
factorization and thus set within a well established family of factorization
models for link prediction. We thus demonstrate that convolution simply offers
a convenient computational means of introducing sparsity and parameter tying to
find an effective trade-off between non-linear expressiveness and the number of
parameters to learn.
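The hypernetwork idea can be sketched as follows: a relation embedding is passed through a (here single-layer) hypernetwork to produce relation-specific 1D convolutional filters, which are then convolved with the subject embedding to extract features. All sizes and weights below are illustrative placeholders:

```python
import numpy as np

d, n_filters, k = 8, 2, 3                 # embedding dim, filters per relation, filter length
rng = np.random.default_rng(0)
H = rng.normal(size=(d, n_filters * k))   # hypernetwork weights (illustrative)
w_r = rng.normal(size=d)                  # relation embedding
e_s = rng.normal(size=d)                  # subject entity embedding

# The hypernetwork maps the relation embedding to a set of filters.
filters = (w_r @ H).reshape(n_filters, k)

# Valid 1D convolution of each relation-specific filter with the subject embedding.
feats = np.stack([np.convolve(e_s, f, mode='valid') for f in filters])
```

Because the filter weights are a linear function of the relation embedding, the whole operation can be rewritten as a tensor factorization, which is the framing the abstract refers to.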
Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models
Prompting language models (LMs) with training examples and task descriptions has been seen as critical to recent successes in few-shot learning. In this work, we show that finetuning LMs in the few-shot setting can considerably reduce the need for prompt engineering. In fact, one can use null prompts (prompts that contain neither task-specific templates nor training examples) and achieve accuracy competitive with manually tuned prompts across a wide range of tasks. While finetuning LMs does introduce new parameters for each downstream task, we show that this memory overhead can be substantially reduced: finetuning only the bias terms achieves comparable or better accuracy than standard finetuning while updating only 0.1% of the parameters. All in all, we recommend finetuning LMs for few-shot learning as it is more accurate, has relatively stable performance across different prompts, and can be made nearly as efficient as using frozen LMs.
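Bias-only finetuning can be sketched on a toy linear model: the pretrained weight matrix stays frozen while gradient steps update only the bias vector. The model, squared-error loss, and learning rate here are illustrative stand-ins, not the paper's actual setup:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3))   # frozen "pretrained" weights, never updated
b = np.zeros(3)               # trainable bias terms (the ~0.1% of parameters)

x = rng.normal(size=4)        # a single toy input
y = np.array([1.0, 0.0, 0.0]) # its target output
lr = 0.1

for _ in range(100):
    pred = x @ W + b
    grad = 2 * (pred - y)     # gradient of squared error w.r.t. b
    b -= lr * grad            # update only the bias; W stays frozen
```

Even with `W` untouched, the bias absorbs the residual on this toy example, mirroring the abstract's point that updating bias terms alone can suffice.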